Secondary Structure Prediction of Large RNAs
Abstract
In general, the thermodynamics and folding kinetics of an RNA can be understood from its secondary structure. Experiments have shown that alternative structures of the same RNA can perform different functions. Understanding the function of an RNA therefore requires studying its folding and refolding behavior. This work covers the Kinwalker and HelixPSO algorithms. Kinwalker predicts RNA folding trajectories, i.e. series of intermediate states connecting the initial structure with the predicted structure. The folding process is split into a series of interlaced folding and transcription events. The refolding mechanism exploits the observation that known metastable states appear to consist of energetically favorable combinations of locally optimal substructures. Kinwalker estimates the first passage times of folding events from the energy barrier between successive structures. The barrier height is computed with a heuristic proposed by Morgan and Higgs, as well as variations and extensions thereof. Kinwalker's predictions show excellent qualitative agreement on a set of sequences with experimentally well-characterized folding pathways, and the estimated folding times are mostly accurate. Kinwalker can process RNAs of up to 1500 nucleotides, which covers most RNAs for which kinetic effects are known to play a crucial role. It can thus handle much longer sequences than other algorithms working at base pair step resolution, and it makes more accurate predictions.

HelixPSO is a Particle Swarm Optimizer (PSO) algorithm, a biologically inspired optimization technique that imitates swarm behavior. It simulates a clustered swarm of particles that collectively explore the fitness landscape of the RNA secondary structure space and share knowledge about the search space with each other. The points in this space are the possible secondary structures of the input RNA, each represented by a permutation of an ordering of the set of possible helices.
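The permutation encoding can be made concrete with a small sketch. The abstract does not say how a permutation of helices is turned into a structure, so the greedy decoding below is an assumption for illustration only: helices are visited in permutation order and each one is kept if it neither reuses a base nor crosses an already chosen pair. The names `Helix`, `helix_pairs`, `decode`, and `swap` are hypothetical and are not taken from HelixPSO; `swap` mirrors the index-swapping move the abstract mentions.

```python
from typing import List, Sequence, Tuple

# Hypothetical helix encoding: (i, j, length) stands for the stacked
# base pairs (i, j), (i+1, j-1), ..., (i+length-1, j-length+1).
Helix = Tuple[int, int, int]
Pair = Tuple[int, int]


def helix_pairs(helix: Helix) -> List[Pair]:
    """Expand a helix into its individual base pairs."""
    i, j, length = helix
    return [(i + k, j - k) for k in range(length)]


def compatible(pairs: Sequence[Pair], chosen: Sequence[Pair]) -> bool:
    """True if `pairs` reuse no base and cross no pair in `chosen`."""
    used = {base for pair in chosen for base in pair}
    for a, b in pairs:
        if a in used or b in used:
            return False
        for c, d in chosen:
            # Crossing pairs would form a pseudoknot, excluded here.
            if a < c < b < d or c < a < d < b:
                return False
    return True


def decode(perm: Sequence[int], helices: Sequence[Helix]) -> List[Pair]:
    """Greedy decoding (an assumption, not the published procedure):
    visit helices in permutation order and keep each compatible one."""
    chosen: List[Pair] = []
    for index in perm:
        pairs = helix_pairs(helices[index])
        if compatible(pairs, chosen):
            chosen.extend(pairs)
    return sorted(chosen)


def swap(perm: Sequence[int], a: int, b: int) -> List[int]:
    """Return a copy of `perm` with positions a and b exchanged."""
    moved = list(perm)
    moved[a], moved[b] = moved[b], moved[a]
    return moved
```

Under this decoding, a single swap can change the structure considerably: moving a helix to the front of the permutation gives it priority over every helix that conflicts with it. A real implementation would also enforce a minimum hairpin loop size and score the decoded structure with an energy model, both of which are omitted here.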
Particles move by transforming one conformation into another through swapping indices. The performance of HelixPSO, measured by free energy or by the number of correctly predicted base pairs, is compared to a set of algorithms implementing Dynamic Programming (RNAfold), Genetic Algorithm (RnaPredict), Simulated Annealing (SARNA-Predict), and PSO (SetPSO) methodologies. When free energy is minimized, HelixPSO consistently achieves lower values than RnaPredict and SetPSO. In base pair prediction, HelixPSO performs close to RNAfold for average scores. For best values, HelixPSO outperforms RNAfold by 9% in sensitivity, 18% in specificity, and 13% in F-measure. HelixPSO outperforms RnaPredict and significantly outperforms SetPSO. HelixPSO does almost as well as SARNA-Predict using the INN and INN-HB energy models, but SARNA-Predict using the advanced efn2 energy model clearly outperforms HelixPSO. For average instead of best values, HelixPSO performs significantly better than SARNA-Predict and SetPSO. PSO is a new approach to the RNA folding problem and HelixPSO performs
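The base pair metrics quoted above can be illustrated with a short sketch. The abstract does not define them, so the definitions here are an assumption: following common usage in RNA structure prediction, sensitivity is taken as TP / (TP + FN), "specificity" as the positive predictive value TP / (TP + FP), and the F-measure as their harmonic mean. The function name `base_pair_scores` is hypothetical.

```python
from typing import Dict, Iterable, Tuple

Pair = Tuple[int, int]


def base_pair_scores(predicted: Iterable[Pair],
                     reference: Iterable[Pair]) -> Dict[str, float]:
    """Compare predicted base pairs against a reference structure.

    Assumed definitions (not stated in the abstract):
      sensitivity = TP / (TP + FN)
      specificity = TP / (TP + FP)   (positive predictive value)
      f_measure   = harmonic mean of the two
    """
    # Normalise each pair so that (i, j) and (j, i) compare equal.
    pred = {tuple(sorted(p)) for p in predicted}
    ref = {tuple(sorted(p)) for p in reference}

    tp = len(pred & ref)   # pairs predicted and present in the reference
    fp = len(pred - ref)   # pairs predicted but absent from the reference
    fn = len(ref - pred)   # reference pairs that were missed

    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tp / (tp + fp) if tp + fp else 0.0
    total = sensitivity + specificity
    f_measure = 2 * sensitivity * specificity / total if total else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_measure": f_measure,
    }
```

For example, a prediction that recovers two of four reference pairs and adds one wrong pair has sensitivity 0.5 and, under the assumed definition, specificity 2/3.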
Similar resources
Prediction of Secondary Structure of Citrus Viroids Reported from Southern Iran
Viroids are the smallest single-stranded, circular, highly structured plant-pathogenic RNAs and do not code for any protein. Viroids belong to two families, the Avsunviroidae and the Pospiviroidae. Members of the Pospiviroidae family adopt a rod-like secondary structure. In this study the most stable secondary structures of citrus viroid variants reported from Fars province wer...
SCARNA: fast and accurate structural alignment of RNA sequences by matching fixed-length stem fragments
MOTIVATION The functions of non-coding RNAs are strongly related to their secondary structures, but it is known that a secondary structure prediction of a single sequence is not reliable. Therefore, we have to collect similar RNA sequences with a common secondary structure for the analyses of a new non-coding RNA without knowing the exact secondary structure itself. Therefore, the sequence comp...
An Alignment Algorithm by Matching Fixed-Length Stem Fragments for Comparing RNA Sequences
The functions of non-coding RNAs are strongly related to their secondary structures, but it is known that a secondary structure prediction of a single sequence is not reliable. Therefore, we have to collect similar RNA sequences with a common secondary structure for the analyses of a new non-coding RNA without knowing the exact secondary structure itself. Therefore, the sequence comparison in ...
Automated 3D structure composition for large RNAs
Understanding the numerous functions that RNAs play in living cells depends critically on knowledge of their three-dimensional structure. Due to the difficulties in experimentally assessing structures of large RNAs, there is currently great demand for new high-resolution structure prediction methods. We present the novel method for the fully automated prediction of RNA 3D structures from a user...
Protein Secondary Structure Prediction: a Literature Review with Focus on Machine Learning Approaches
A DNA sequence, containing all genetic traits, is not a functional entity. Instead, it is converted into protein sequences by the transcription and translation processes. The protein sequence later takes on a 3D structure, which is the functional unit and can manage biological interactions using the information encoded in DNA. Every life process one can think of is carried out by proteins with specific functio...
RNA secondary structure prediction using conditional random fields model
Non-coding RNAs (ncRNAs) have important biological functions in living cells dependent on their conserved secondary structures. Here, we focus on computational RNA secondary structure prediction by exploring primary sequences and complementary base pair interactions using the Conditional Random Fields (CRFs) model, which treats RNA prediction as a sequence labelling problem. Proposing suitable ...
Publication date: 2008